
Prebuild CLI E2E Docker image #16787

Merged: sebastienros merged 8 commits into main from sebros/prebuild-cli-e2e-image on May 6, 2026



sebastienros (Contributor) commented May 5, 2026

CLI E2E tests are split across many isolated GitHub Actions jobs, so each job cold-builds the same default Dockerfile.e2e image before tests start. That repeats slow and failure-prone apt/docker build work across the matrix.

This change adds a reusable workflow that builds the shared CLI E2E Docker images once, saves them as artifacts, and has Linux CLI E2E jobs load those images before running tests. The test helper supports explicit image overrides for the DotNet/Python, polyglot, and Java polyglot variants and only fails fast when the matching ASPIRE_E2E_REQUIRE_*_IMAGE variable is set, so workflows that have not opted into the prebuilt image path can still fall back to the existing Dockerfile build behavior.
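The selection rule described above can be sketched roughly as follows. This is an illustrative Python model, not the C# helper from the PR: `ASPIRE_E2E_DOTNET_IMAGE` comes from the review summary, `ASPIRE_E2E_REQUIRE_DOTNET_IMAGE` is an assumed expansion of the `ASPIRE_E2E_REQUIRE_*_IMAGE` pattern, and `DockerSource` and `select_docker_source` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DockerSource:
    """Exactly one of `image` (prebuilt tag) or `dockerfile` (build path) is set."""
    image: Optional[str] = None
    dockerfile: Optional[str] = None


def select_docker_source(
    env: Mapping[str, str],
    image_var: str = "ASPIRE_E2E_DOTNET_IMAGE",
    # Assumed expansion of the ASPIRE_E2E_REQUIRE_*_IMAGE pattern.
    require_var: str = "ASPIRE_E2E_REQUIRE_DOTNET_IMAGE",
    dockerfile: str = "Dockerfile.e2e",
) -> DockerSource:
    image = env.get(image_var, "").strip()
    if image:
        # Explicit override wins: use the preloaded image and skip the build.
        return DockerSource(image=image)
    # Assumption: "1" or "true" (any case) counts as the flag being set.
    if env.get(require_var, "").strip().lower() in {"1", "true"}:
        raise RuntimeError(
            f"{require_var} is set but {image_var} is empty; "
            "refusing to fall back to a Dockerfile build."
        )
    # Workflow has not opted in: keep the existing Dockerfile build behavior.
    return DockerSource(dockerfile=dockerfile)
```

With an empty environment the sketch returns the Dockerfile path, so local runs and workflows that have not opted in behave as before; only the combination "required but missing" raises.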

The prebuild is wired into the regular split CLI E2E matrix, quarantined/outerloop specialized runs, daily CLI smoke tests, and the flaky-test reproduction workflow when TEST_PROJECT is Cli.EndToEnd. The shared artifacts now cover Dockerfile.e2e, Dockerfile.e2e-polyglot-base, and Dockerfile.e2e-polyglot-java; Podman keeps its existing variant-specific image path because it uses the privileged nested-runtime setup. There are no Rust or Go CLI E2E Dockerfile variants today.

The image build now uses BuildKit with a GitHub Actions remote cache for the DotNet/Python and polyglot base images, and retries with the default Ubuntu apt sources if the Azure mirror build fails. The default DotNet image no longer installs Java; Java CLI E2E coverage uses the PolyglotJava Dockerfile variant, so the heavy JDK layer is isolated to the Java image. Repository-dependent script copies are also ordered after the toolchain and bundle layers to preserve cache hits when source files or install scripts change independently.
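A minimal sketch of the mirror-then-default retry, assuming the mirror is passed as a build argument. The `APT_MIRROR` build-arg name, the function names, and the injectable `run` callback are assumptions for illustration; the actual workflow does this in a shell step.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def build_command(tag: str, dockerfile: str, apt_mirror: Optional[str]) -> List[str]:
    """Compose a `docker buildx build` invocation with the GHA cache backend."""
    cmd = [
        "docker", "buildx", "build", "--load",
        "--file", dockerfile,
        "--tag", tag,
        "--cache-from", "type=gha",
        "--cache-to", "type=gha,ignore-error=true",
    ]
    if apt_mirror:
        # APT_MIRROR is a hypothetical build-arg name, not taken from the PR.
        cmd += ["--build-arg", f"APT_MIRROR={apt_mirror}"]
    cmd.append(".")
    return cmd


def build_with_fallback(
    tag: str,
    dockerfile: str,
    apt_mirror: str,
    run: Runner = run_command,
) -> str:
    """Try the mirror build first, then retry once with default apt sources."""
    if run(build_command(tag, dockerfile, apt_mirror)) == 0:
        return "mirror"
    if run(build_command(tag, dockerfile, None)) == 0:
        return "default"
    raise RuntimeError(f"Both build attempts failed for {tag}")
```

Injecting `run` keeps the retry logic testable without a Docker daemon; the return value records which attempt produced the image.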

Build the default CLI E2E Docker image once per workflow and load it in split test jobs through an explicit image override.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 5, 2026 17:16
github-actions Bot (Contributor) commented May 5, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 16787

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 16787"

Wire daily smoke and flaky reproduction workflows into the prebuilt image path, and scope the helper fail-fast behavior to workflows that explicitly require the preloaded image.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@sebastienros sebastienros marked this pull request as draft May 5, 2026 17:20
Copilot AI (Contributor) left a comment

Pull request overview

This PR reduces redundant Docker builds across the CLI E2E GitHub Actions matrix by prebuilding the default CLI E2E Docker image once, publishing it as an artifact, and loading/tagging it in downstream Linux test jobs. It also updates the CLI E2E helper to support an explicit prebuilt image override via ASPIRE_E2E_DOTNET_IMAGE, while preserving Dockerfile-based builds for local runs.

Changes:

  • Add a reusable workflow to build/save/upload the default CLI E2E Docker image as an artifact, and wire it into the main and specialized test pipelines.
  • Update the reusable run-tests.yml workflow to download/load/tag the prebuilt image on Linux and export ASPIRE_E2E_DOTNET_IMAGE for tests.
  • Refactor CLI E2E test helpers to select between “prebuilt image” vs “build from Dockerfile”, and add targeted unit tests for that selection logic.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/Aspire.Cli.EndToEnd.Tests/Helpers/CliInstallStrategyTests.cs | Adds tests validating Docker source selection behavior (prebuilt image vs Dockerfile fallback vs CI enforcement). |
| tests/Aspire.Cli.EndToEnd.Tests/Helpers/CliE2ETestHelpers.cs | Introduces ASPIRE_E2E_DOTNET_IMAGE support and CI behavior for choosing prebuilt image vs Dockerfile build. |
| .github/workflows/tests.yml | Adds a prebuild-image job and updates job dependencies/gating to include it where CLI E2E runs are present. |
| .github/workflows/tests-quarantine.yml | Updates the PR path filter list to include the new reusable image build workflow. |
| .github/workflows/tests-outerloop.yml | Updates the PR path filter list to include the new reusable image build workflow. |
| .github/workflows/specialized-test-runner.yml | Wires the prebuild-image workflow into specialized runs and gates execution appropriately. |
| .github/workflows/run-tests.yml | Downloads/loads/tags the prebuilt image on Linux CLI-archive runs and sets ASPIRE_E2E_DOTNET_IMAGE. |
| .github/workflows/build-cli-e2e-image.yml | New reusable workflow that builds and uploads the default CLI E2E Docker image artifact. |

Comment thread tests/Aspire.Cli.EndToEnd.Tests/Helpers/CliE2ETestHelpers.cs
Comment thread .github/workflows/build-cli-e2e-image.yml Outdated
sebastienros and others added 3 commits May 5, 2026 11:26
Use a stable prebuilt image tag for CLI E2E artifact consumers and clear Docker build args when Hex1b runs from a prebuilt image.

Add BuildKit remote cache with Ubuntu mirror fallback, remove Java from the default .NET image, and keep repository-dependent script copies after cache-stable image layers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Only clear Hex1b build args when the CLI E2E helper is using a prebuilt image without a Dockerfile path.

Keep SKIP_SOURCE_BUILD and apt mirror build args for polyglot and other Dockerfile variants.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Generalize the CLI E2E prebuilt image workflow so it produces the shared .NET/Python, polyglot, and Java polyglot images once per workflow run. CLI E2E jobs now load the matching artifacts and the helper can consume variant-specific image overrides with explicit requirements.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@sebastienros sebastienros marked this pull request as ready for review May 5, 2026 22:21
Clear variant-specific prebuilt image environment variables in tests that intentionally verify fallback to Dockerfile builds. This matches the CI environment where the shared prebuilt image variables are set globally before running CliInstallStrategyTests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
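The env-clearing idea in the commit above can be illustrated with a small scoped helper. This is a Python sketch of the pattern only; the real tests are C#, and `cleared_env` is a hypothetical name.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Sequence


@contextmanager
def cleared_env(names: Sequence[str]) -> Iterator[None]:
    """Temporarily remove the given variables, restoring them on exit."""
    saved = {name: os.environ.pop(name) for name in names if name in os.environ}
    try:
        yield
    finally:
        # Drop anything the test body set, then put the original values back.
        for name in names:
            os.environ.pop(name, None)
        os.environ.update(saved)
```

A fallback test would wrap its body in `with cleared_env(["ASPIRE_E2E_DOTNET_IMAGE"]):` so that a globally exported image variable in CI cannot mask the Dockerfile path under test.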
JamesNK (Member) left a comment

LGTM — the workflow orchestration, C# test helper refactoring, and Dockerfile cache optimization all look solid. Two minor comments: one potential fragile coupling in run-tests.yml (REQUIRE flag set outside the file-existence guard) and a cosmetic indent inconsistency in tests.yml.

Comment thread .github/workflows/run-tests.yml Outdated
Comment thread .github/workflows/tests.yml Outdated
radical (Member) left a comment

Looks good. Just a few follow-up suggestions.

Comment thread .github/workflows/run-tests.yml
Comment thread .github/workflows/build-cli-e2e-image.yml
Comment thread .github/workflows/build-cli-e2e-image.yml
radical (Member) left a comment

Two inline comments on the test helper / test scoping. Two non-blocking points for the description / validation:

1. The new GHA BuildKit cache is silently inert. build-cli-e2e-image.yml invokes docker buildx build --cache-from type=gha --cache-to type=gha,…,ignore-error=true without docker/setup-buildx-action (or crazy-max/ghaction-github-runtime), so neither ACTIONS_RUNTIME_TOKEN nor ACTIONS_RESULTS_URL is exposed to the build step. The gha cache backend can't authenticate and ignore-error=true swallows the cache-write failure. As a result there is no cross-run cache today — the only caching that happens is BuildKit's local layer cache for that single docker buildx build invocation, which doesn't survive between workflow runs (each run gets a fresh hosted runner / fresh cli-e2e-builder driver). Within a single workflow run the build job executes once anyway and ships its output via actions/upload-artifact, so there's no intra-run cache benefit either.

This is a soft issue (the prebuild still works, just without the claimed cache benefit), so two valid resolutions:

  • Fix it: add a pinned docker/setup-buildx-action@… step before the build (and drop the manual docker buildx create --use); the action both creates the builder and exports the runtime token / cache URL needed for type=gha.
  • Or update the PR description to reflect that the optimization is per-workflow consolidation (one build job per matrix instead of N rebuilds) rather than a cross-run BuildKit cache.

2. Validation on the full quarantine + outerloop pipelines. Both workflows already auto-trigger on this PR via their paths: filters, and the most recent runs against sebros/prebuild-cli-e2e-image (outerloop run 25405631870, quarantine run 25405631888) completed successfully. I've also manually re-triggered both against the latest head:

Worth waiting on these (and any subsequent re-runs after addressing the comments below) before merge, given the change touches the prebuild plumbing for both pipelines.
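On point 1, one way to keep an inert cache from going unnoticed is to check for the runtime variables before adding the cache flags. The sketch below assumes, as the comment states, that `ACTIONS_RUNTIME_TOKEN` and `ACTIONS_RESULTS_URL` are what the `type=gha` backend needs; `gha_cache_args` and the `scope` value are hypothetical.

```python
from typing import List, Mapping

# Variables named in the review comment as required by the gha cache backend.
REQUIRED_VARS = ("ACTIONS_RUNTIME_TOKEN", "ACTIONS_RESULTS_URL")


def gha_cache_args(env: Mapping[str, str], scope: str = "cli-e2e") -> List[str]:
    """Return buildx cache flags, or no flags (with a warning) if auth is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # `::warning::` is a GitHub Actions workflow command that surfaces
        # the message as an annotation on the run.
        print(f"::warning::type=gha cache disabled; missing {', '.join(missing)}")
        return []
    return [
        "--cache-from", f"type=gha,scope={scope}",
        "--cache-to", f"type=gha,scope={scope},ignore-error=true",
    ]
```

Because `ignore-error=true` hides cache-write failures, the explicit warning is the only signal in this sketch that the build ran without a cross-run cache.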

Comment thread tests/Aspire.Cli.EndToEnd.Tests/Helpers/CliE2ETestHelpers.cs
Comment thread tests/Aspire.Cli.EndToEnd.Tests/Helpers/CliInstallStrategyTests.cs Outdated
radical (Member) left a comment

Adding the GHA cache point as an inline comment for visibility on the file itself.

Comment thread .github/workflows/build-cli-e2e-image.yml
sebastienros and others added 2 commits May 6, 2026 07:17
Extract prebuilt CLI E2E image loading into a shared script, document the image contract, and make Java image requirements fail fast during artifact loading. Tighten helper tests and prebuilt-image strategy validation, align workflow indentation, and initialize Buildx through setup-buildx-action so the GHA cache backend can authenticate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Export the GitHub Actions cache runtime with the repo-approved github-script action and create the Buildx builder in the shell step so CLI E2E image cache remains active without triggering workflow startup restrictions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions Bot (Contributor) commented May 6, 2026

🎬 CLI E2E Test Recordings — 77 recordings uploaded (commit d5f29e8)

Status Test Recording
AddPackageInteractiveWhileAppHostRunningDetached ▶️ View Recording
AddPackageWhileAppHostRunningDetached ▶️ View Recording
AgentCommands_AllHelpOutputs_AreCorrect ▶️ View Recording
AgentInitCommand_DefaultSelection_InstallsSkillOnly ▶️ View Recording
AgentInitCommand_MigratesDeprecatedConfig ▶️ View Recording
AspireAddPackageVersionToDirectoryPackagesProps ▶️ View Recording
AspireInitSingleFileAppHostRunsViaDotnetRunAppHost ▶️ View Recording
AspireUpdateRemovesAppHostPackageVersionFromDirectoryPackagesProps ▶️ View Recording
Banner_DisplayedOnFirstRun ▶️ View Recording
Banner_DisplayedWithExplicitFlag ▶️ View Recording
Banner_NotDisplayedWithNoLogoFlag ▶️ View Recording
CertificatesClean_RemovesCertificates ▶️ View Recording
CertificatesTrust_WithNoCert_CreatesAndTrustsCertificate ▶️ View Recording
CertificatesTrust_WithUntrustedCert_TrustsCertificate ▶️ View Recording
ConfigSetGet_CreatesNestedJsonFormat ▶️ View Recording
CreateAndRunAspireStarterProject ▶️ View Recording
CreateAndRunAspireStarterProjectWithBundle ▶️ View Recording
CreateAndRunEmptyAppHostProject ▶️ View Recording
CreateAndRunJavaEmptyAppHostProject ▶️ View Recording
CreateAndRunJsReactProject ▶️ View Recording
CreateAndRunPythonReactProject ▶️ View Recording
CreateAndRunTypeScriptEmptyAppHostProject ▶️ View Recording
CreateAndRunTypeScriptStarterProject ▶️ View Recording
CreateJavaAppHostWithViteApp ▶️ View Recording
CreateTypeScriptAppHostWithViteApp_UsesConfiguredToolchain ▶️ View Recording
DashboardRunWithOtelTracesReturnsNoTraces ▶️ View Recording
DeployK8sBasicApiService ▶️ View Recording
DeployK8sWithGarnet ▶️ View Recording
DeployK8sWithMongoDB ▶️ View Recording
DeployK8sWithMySql ▶️ View Recording
DeployK8sWithPostgres ▶️ View Recording
DeployK8sWithRabbitMQ ▶️ View Recording
DeployK8sWithRedis ▶️ View Recording
DeployK8sWithSqlServer ▶️ View Recording
DeployK8sWithValkey ▶️ View Recording
DeployTypeScriptAppToKubernetes ▶️ View Recording
DescribeCommandResolvesReplicaNames ▶️ View Recording
DescribeCommandShowsRunningResources ▶️ View Recording
DetachFormatJsonProducesValidJson ▶️ View Recording
DetachFormatJsonProducesValidJsonWhenRestartingExistingInstance ▶️ View Recording
DoListStepsShowsPipelineSteps ▶️ View Recording
DocsCommand_RendersInteractiveMarkdownFromLocalSource ▶️ View Recording
DoctorCommand_DetectsDeprecatedAgentConfig ▶️ View Recording
DoctorCommand_TypeScriptAppHostReportsMissingConfiguredToolchain ▶️ View Recording
DoctorCommand_WithSslCertDir_ShowsTrusted ▶️ View Recording
DoctorCommand_WithoutSslCertDir_ShowsPartiallyTrusted ▶️ View Recording
GlobalMigration_HandlesCommentsAndTrailingCommas ▶️ View Recording
GlobalMigration_HandlesMalformedLegacyJson ▶️ View Recording
GlobalMigration_PreservesAllValueTypes ▶️ View Recording
GlobalMigration_SkipsWhenNewConfigExists ▶️ View Recording
GlobalSettings_MigratedFromLegacyFormat ▶️ View Recording
InitTypeScriptAppHost_AugmentsExistingViteRepoAtRoot ▶️ View Recording
InteractiveCSharpInitCreatesExpectedFiles ▶️ View Recording
InvalidAppHostPathWithComments_IsHealedOnRun ▶️ View Recording
LatestCliCanStartStableChannelAppHost ▶️ View Recording
LatestCliCanStartStableChannelTypeScriptAppHost ▶️ View Recording
LegacySettingsMigration_AdjustsRelativeAppHostPath ▶️ View Recording
LogsCommandShowsResourceLogs ▶️ View Recording
OtelLogsReturnsStructuredLogsFromStarterAppCore ▶️ View Recording
PsCommandListsRunningAppHost ▶️ View Recording
PsFormatJsonOutputsOnlyJsonToStdout ▶️ View Recording
PublishWithConfigureEnvFileUpdatesEnvOutput ▶️ View Recording
PublishWithDockerComposeServiceCallbackSucceeds ▶️ View Recording
PublishWithoutOutputPathUsesAppHostDirectoryDefault ▶️ View Recording
RestoreGeneratesSdkFiles ▶️ View Recording
RestoreGeneratesSdkFiles_WithConfiguredToolchain ▶️ View Recording
RestoreRefreshesGeneratedSdkAfterAddingIntegration ▶️ View Recording
RestoreSupportsConfigOnlyHelperPackageAndCrossPackageTypes ▶️ View Recording
RunFromParentDirectory_UsesExistingConfigNearAppHost ▶️ View Recording
SecretCrudOnDotNetAppHost ▶️ View Recording
SecretCrudOnTypeScriptAppHost ▶️ View Recording
StagingChannel_ConfigureAndVerifySettings_ThenSwitchChannels ▶️ View Recording
StartAndWaitForTypeScriptSqlServerAppHostWithNativeAssets ▶️ View Recording
StopAllAppHostsFromAppHostDirectory ▶️ View Recording
StopNonInteractiveSingleAppHost ▶️ View Recording
StopWithNoRunningAppHostExitsSuccessfully ▶️ View Recording
UnAwaitedChainsCompileWithAutoResolvePromises ▶️ View Recording

📹 Recordings uploaded automatically from CI run #25441231885

radical (Member) commented May 6, 2026

Two more nits surfaced from a follow-up review pass — both low-priority, neither blocking. Tracking on #16825 alongside the existing follow-up so this PR can land:

  • Substring-based Java image gate (run-tests.yml:135-141, 151): contains(inputs.testShortName, 'Java') works today because every DockerfileVariant.PolyglotJava test class happens to start with Java. A future class using that variant without Java in its name (e.g. MavenIntegrationTests, KotlinPolyglotTests) will silently get --require-java=false, skip the artifact, and fall back to a from-source Dockerfile build — which is exactly what the strict-mode flag was meant to make loud. Different concern from Prebuild CLI E2E Docker image #16787 (review) r3192698415; that one fixed "given --require-java=true, fail loudly if the tarball is missing." The remaining gap is one layer up — the value passed to --require-java is itself substring-derived.

  • Java image build symmetry (build-cli-e2e-image.yml:84-100): build_java_image uses DOCKER_BUILDKIT=1 docker build while the other two go through docker buildx build --load. Works on GHA Linux because plain Docker Engine has no builder→buildx alias — docker build bypasses the --use'd cli-e2e-builder and falls through to the daemon's integrated BuildKit (build log: #0 building with "default" instance using docker driver). On Docker Desktop, where the alias is on by default, the same script would route through the docker-container builder, miss --load, and fail at the subsequent docker save with "No such image". Cheap to use the shared build_image helper for the Java build too.
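The first nit can be illustrated with a small model of the two gating strategies. The class-to-variant table and `JavaEmptyAppHostTests` are hypothetical; `MavenIntegrationTests` is the future example named above. The substring check is modeled case-insensitively because the GitHub Actions `contains()` expression function is not case sensitive.

```python
from enum import Enum


class DockerfileVariant(Enum):
    DOTNET = "DotNet"
    POLYGLOT = "Polyglot"
    POLYGLOT_JAVA = "PolyglotJava"


def requires_java_by_name(test_short_name: str) -> bool:
    """Model of contains(inputs.testShortName, 'Java'), which ignores case."""
    return "java" in test_short_name.casefold()


# Hypothetical explicit mapping from test class to the variant it declares.
VARIANT_BY_TEST = {
    "JavaEmptyAppHostTests": DockerfileVariant.POLYGLOT_JAVA,
    "MavenIntegrationTests": DockerfileVariant.POLYGLOT_JAVA,
    "StarterTemplateTests": DockerfileVariant.DOTNET,
}


def requires_java_by_variant(test_short_name: str) -> bool:
    """Derive the gate from the declared variant; unknown names raise KeyError."""
    return VARIANT_BY_TEST[test_short_name] is DockerfileVariant.POLYGLOT_JAVA


for name in VARIANT_BY_TEST:
    print(name, requires_java_by_name(name), requires_java_by_variant(name))
```

In this model `MavenIntegrationTests` is the row where the two gates disagree: the name check returns false while the declared variant requires the Java image.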

radical (Member) left a comment

LGTM!

@sebastienros sebastienros merged commit a8edb63 into main May 6, 2026
307 checks passed
@sebastienros sebastienros deleted the sebros/prebuild-cli-e2e-image branch May 6, 2026 21:37
@microsoft-github-policy-service microsoft-github-policy-service Bot added this to the 13.4 milestone May 6, 2026
aspire-repo-bot (Contributor) commented

No documentation PR is required for this change.

This PR is a CI/build infrastructure improvement that adds a reusable GitHub Actions workflow to prebuild shared CLI E2E Docker images, reducing redundant work across the test matrix. It contains no user-facing changes, new public APIs, new configuration options, or behavioral changes that affect Aspire developers or users.

Generated by PR Documentation Check for issue #16787


5 participants